Abstract

Suppose one has access to oracles generating samples from two unknown probability distributions
$p$ and $q$ on some $N$-element set. How many samples does one need to test whether the two
distributions are close or far from each other in the $L_1$-norm? This and related questions
have been extensively studied in recent years in the field of property testing. In the present
paper we study quantum algorithms for testing properties of distributions. It is shown that the
$L_1$-distance $\|p - q\|_1$ can be estimated with a constant precision using only $O(N^{1/2})$ queries in
the quantum setting, whereas classical computers need $\Omega(N^{1-o(1)})$ queries. We also describe
quantum algorithms for testing Uniformity and Orthogonality with query complexity $O(N^{1/3})$.
The classical query complexity of these problems is known to be $\Omega(N^{1/2})$. A quantum algorithm
for testing Uniformity has been recently independently discovered by Chakraborty et al. [13].

1 Introduction 

1.1 Problem statement and main results 

Suppose one has access to a black box generating independent samples from an unknown probability
distribution $p$ on some $N$-element set. If the number of available samples grows linearly with $N$,
one can use the standard Monte Carlo method to simultaneously estimate the probability $p_i$ of
every element $i = 1, \ldots, N$ and thus obtain a good approximation to the entire distribution $p$. On
the other hand, many important questions that one usually encounters in statistical analysis can
be answered using only a sublinear number of samples. For example, deciding whether $p$ is close in
the $L_1$-norm to another distribution $q$ requires approximately $N^{1/2}$ samples if $q$ is known [6] and
approximately $N^{2/3}$ samples if $q$ is also specified by a black box [7]. Another example is estimating
the Shannon entropy $H(p) = -\sum_i p_i \log_2 p_i$. It was shown in [10, 19] that distinguishing whether
$H(p) \le a$ or $H(p) \ge b$ requires approximately $N^{a/b}$ samples. Other examples include deciding
whether $p$ is close to a monotone or a unimodal distribution [9], and deciding whether a pair of
distributions have disjoint supports [14]. These and other questions fall into the field of distribution
testing [8, 19] that studies how many samples one needs to decide whether an unknown distribution
has a certain property or is far from having this property. The purpose of the present paper is
to explore whether quantum computers are capable of solving distribution testing problems more
efficiently.

The black-box sampling model adopted in [6, 7, 10, 9, 8, 19] assumes that a tester is presented
with a list of samples drawn from an unknown distribution. What does it mean to sample from
an unknown distribution in the quantum setting? Let us start by casting the black-box sampling
model into a form that admits a quantum generalization. Suppose $p$ is an unknown distribution
on an $N$-element set $[N] = \{1, \ldots, N\}$ and let $S$ be some specified integer. We shall assume that
$p$ is represented by an oracle $O_p : [S] \to [N]$ such that the probability $p_i$ of any element $i \in [N]$ is
proportional to the number of elements in the pre-image of $i$, that is, the number of inputs $s \in [S]$
such that $O_p(s) = i$. In other words, one can sample from $p$ by querying the oracle $O_p$ on a random
input $s \in [S]$ drawn from the uniform distribution (see footnote 1). Note that a tester interacting with an oracle
can potentially be more powerful due to the possibility of making adaptive queries which could
allow him to learn the internal structure of the oracle as opposed to the black-box model. However,
it will be shown below (see Lemma 9 in Section 6) that the oracle model and the black-box model
are in fact equivalent. More precisely, for any fixed $N$ one can always choose sufficiently large $S$
such that a tester will need the same number of queries in both models.
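
The oracle model above can be illustrated with a small classical sketch. The helper names `build_oracle` and `sample` are hypothetical, the oracle is stored as a plain lookup table, and inputs are 0-based in the code although the text uses $s \in [S]$; this is only one of many tables that generate the same distribution.

```python
import random
from fractions import Fraction

def build_oracle(p, S):
    """Return a table O of length S with values in 1..N such that the fraction
    of inputs mapped to i equals p[i-1]. Every p_i must be a multiple of 1/S."""
    table = []
    for i, p_i in enumerate(p, start=1):
        count = p_i * S
        if count.denominator != 1:
            raise ValueError("p_i must be a multiple of 1/S")
        table.extend([i] * int(count))
    if len(table) != S:
        raise ValueError("probabilities must sum to 1")
    return table

def sample(oracle, rng):
    """Draw one sample from p by querying the oracle on a uniform input s."""
    s = rng.randrange(len(oracle))      # 0-based stand-in for s in [S]
    return oracle[s]

p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
oracle = build_oracle(p, S=8)
rng = random.Random(0)
draws = [sample(oracle, rng) for _ in range(10_000)]
freq_1 = draws.count(1) / len(draws)    # should be close to p_1 = 1/2
```

Storing the pre-images contiguously is a convenience of this sketch; any permutation of the table generates the same $p$.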

The oracle model admits a standard quantum generalization. Specifically, we shall transform
the oracle $O_p$ into a reversible form by keeping a copy of the input and writing the output of $O_p$
into an ancillary register. A quantum oracle generating $p$ is a unitary operator whose action on
basis vectors coincides with the reversible version of $O_p$, as we will explain further in Section 2.

The present paper focuses on testing three particular properties of distributions, namely, Sta- 
tistical Difference, Orthogonality, and Uniformity. The corresponding property testing problems 
are promise problems so that a tester is required to give a correct answer (with a bounded error 
probability) only for those instances that satisfy the promise. 

Problem 1 (Testing Uniformity).

Instance: Integers $N$, $S$, precision $\epsilon > 0$. Access to an oracle generating a distribution $p$ on $[N]$.
Promise: Either $p$ is the uniform distribution or the $L_1$-distance between $p$ and the uniform
distribution is at least $\epsilon$.
Decide which one is the case.

Problem 2 (Testing Orthogonality).

Instance: Integers $N$, $S$, precision $\epsilon > 0$. Access to oracles generating distributions $p$, $q$ on $[N]$.
Promise: Either $p$ and $q$ are orthogonal or the $L_1$-distance between $p$ and $q$ is at most $2 - \epsilon$.
Decide which one is the case.

Problem 3 (Testing Statistical Difference).

Instance: Integers $N$, $S$, thresholds $0 \le a < b \le 2$. Access to oracles generating distributions $p$ and
$q$ on $[N]$.

Promise: Either $\|p - q\|_1 \le a$ or $\|p - q\|_1 \ge b$.
Decide which one is the case.

We assume that the precision $\epsilon$ is bounded from below by a fixed constant independent of $N$,
for instance, $\epsilon \ge 1/10$. The same applies to the decision gap $b - a$ for testing Statistical Difference.
Given a function $f(N)$ we shall say that a property is testable in $f(N)$ queries if there exists a
testing algorithm making at most $f(N)$ queries that gives a correct answer with a sufficiently high
probability (say 2/3) for any distributions $p$, $q$ satisfying the promise and for any oracles (see footnote 2)
specifying $p$ and $q$. If a promise is violated, a tester can give an arbitrary answer.

1 Although in this model probabilities $p_i$ can only take values that are multiples of $1/S$, choosing sufficiently large
$S$ allows one to represent any distribution $p$ with an arbitrarily small error.

Our main results are the following theorems.

Theorem 1. Statistical Difference is testable on a quantum computer in $O(N^{1/2})$ queries.

Theorem 2. Uniformity is testable on a quantum computer in $O(N^{1/3})$ queries.

Theorem 3. Orthogonality is testable on a quantum computer in $O(N^{1/3})$ queries.

It is known that classically testing Orthogonality and Uniformity requires $\Omega(N^{1/2})$ queries,
see Sections 6.2 and 6.3, while Statistical Difference is not testable in $O(N^{\alpha})$ queries for any
$\alpha < 1$, see [19]. Therefore quantum computers provide a polynomial speedup for testing Uniformity,
Orthogonality, and Statistical Difference in terms of query complexity.

Testing Orthogonality is closely related to the Collision Problem studied in [12, 1]. In Section 6.2
we describe a randomized reduction from the Collision Problem to testing Orthogonality. Using
the quantum lower bound for the Collision Problem due to Aaronson and Shi [3] we obtain the
following result.

Theorem 4. Testing Orthogonality on a quantum computer requires $\Omega(N^{1/3})$ queries.

Quite recently Chakraborty, Fischer, Matsliah, and de Wolf [13] independently discovered a
quantum Uniformity testing algorithm with query complexity $O(N^{1/3})$ and proved a lower bound
$\Omega(N^{1/3})$ for testing Uniformity. These authors also presented a quantum algorithm for testing
whether an unknown distribution $p$ coincides with a known distribution $q$ with query complexity
$\tilde{O}(N^{1/3})$.

1.2 Discussion and open problems 

One motivation for studying distribution testing problems is that testing Orthogonality and
Statistical Difference are complete problems for the complexity class SZK (Statistical Zero Knowledge).
More precisely, the following problem known as Statistical Difference was shown to be SZK-complete
by Vadhan [16]:

Input: description of classical circuits $C_p$, $C_q$ that implement oracle functions $O_p, O_q : [S] \to [N]$
and a pair of real numbers $0 \le a < b \le 2$ such that $2a < b^2$.

Problem: Decide whether $\|p - q\|_1 \ge b$ (yes-instance) or $\|p - q\|_1 \le a$ (no-instance).

The class SZK includes many interesting algebraic and graph theoretic problems such as Discrete
Logarithm, Graph Isomorphism, Graph Nonisomorphism, Quadratic Residuosity, and The Shortest
Vector in Lattice, see [4] and references therein. Thus it is natural to ask whether quantum
computers provide a universal speedup for problems in SZK similar to the square-root speedup
for problems in NP provided by the Grover search algorithm. Assuming that the circuits $C_p$, $C_q$
have size $\mathrm{poly}(\log N)$, one can easily translate the testing algorithm described in Section 3 to
a quantum circuit of size $O(\sqrt{N})$ solving the Statistical Difference problem for any constants $a$, $b$ as
above. On the other hand, any classical algorithm treating the circuits $C_p$, $C_q$ as black boxes would
need roughly $N^{1-o(1)}$ queries, see [19], thus requiring a circuit of size $\Omega(N^{1-o(1)})$.

Note that the Statistical Difference problem with $b = 2$ is equivalent to testing Orthogonality.
It can be solved classically in time $O(N^{1/2})$ using the classical collision finding algorithm.
Unfortunately, the circuit complexity of the quantum Orthogonality testing algorithm described in
Section 5 may be different from its query complexity since it uses a quantum membership oracle
for a randomly generated set. It is an open problem whether the Statistical Difference problem with
$b = 2$ can be solved by a quantum circuit of size $O(N^{1/3})$, although with a suitably powerful model
of quantum RAM, such membership queries can be done in time $\mathrm{polylog}(N)$. A related question
is that of space-time tradeoffs: our algorithms generally require storing classical bits and
then querying them with quantum algorithms that use $\mathrm{poly}(\log N)$ qubits. We suspect that this
amount of storage cannot be reduced without increasing the run-time, but do not have a proof
of this conjecture. Similar issues of quantum data structures for set membership and conjectured
space-time tradeoffs have arisen for the element distinctness problem [5, 15].

2 Note that according to this definition a tester needs at most $f(N)$ queries even in the limit $S \to \infty$.

It is worth mentioning that all distribution properties studied in this paper are symmetric, that 
is, these properties are invariant under relabeling of elements in the underlying set {1,... ,N}. 
Testing symmetric properties of distributions is equivalent to testing properties of functions from 
[S] to [N] that are invariant under any permutations of inputs and outputs of the function. It was 
recently shown by Aaronson and Ambainis that quantum computers can provide at most polynomial 
speedup for testing properties of such symmetric functions [2]. 

More interesting than the mere fact of polynomial speedups provided by Theorems 1,2,3 is 
the way in which our algorithms achieve it. Classically, the results of Ref. [19] provide a simple 
characterization of an asymptotically optimal testing algorithm for any symmetric property of a 
distribution (satisfying certain natural continuity conditions). By contrast, our algorithms use a 
variety of different strategies both to query the oracles and to analyze the results of those queries. 
These strategies appear not to be special cases of the quantum walk framework which has been 
responsible for most of the polynomial quantum speedups found to date [18, 17]. A major challenge 
for future research is to give a quantum version of the Canonical Tester algorithm of Ref. [19]; in other
words, we would like to characterize optimal quantum algorithms for testing any symmetric property
of a distribution (or a pair of distributions).

Finally, let us remark that the algorithm for estimating statistical difference described in
Section 3 can be easily generalized to construct a quantum algorithm for estimating the von Neumann
entropy of a black-box distribution with query complexity $O(N^{1/2})$. Using similar ideas one can
construct an $O(N^{1/2})$-time algorithm for estimating the fidelity between two black-box distributions.

The rest of the paper is organized as follows. Section 2 introduces necessary notations and 
basic facts about the quantum counting algorithm by Brassard, Hoyer, Mosca, and Tapp [11]. The 
distribution testing algorithms described in the rest of the paper are actually classical probabilistic 
algorithms using the quantum counting as a subroutine. Theorem 1 is proved in Section 3. Theo- 
rem 2 is proved in Section 4. Theorem 3 is proved in Section 5. We discuss lower bounds for the 
above distribution testing problems in Section 6. 

2 Preliminaries 

Let $\mathcal{D}_N$ be the set of probability distributions $p = (p_1, \ldots, p_N)$ such that the probability $p_i$ of any
element $i \in [N]$ is a rational number. Let us say that an oracle $O : [S] \to [N]$ generates a
distribution $p \in \mathcal{D}_N$ iff for all $i \in [N]$ the probability $p_i$ equals the fraction of inputs $s \in [S]$ such
that $O(s) = i$,

$$p_i = \frac{1}{S} \, \#\{ s \in [S] : O(s) = i \}.$$

Note that the identity of elements in the domain of an oracle $O$ is irrelevant, so if $O$ generates $p$
and $\sigma$ is any permutation on $[S]$ then $O \circ \sigma$ also generates $p$. By definition, any map $O : [S] \to [N]$
generates some distribution $p \in \mathcal{D}_N$.
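
As a small illustration of this definition, the following sketch recovers the distribution generated by an oracle table and checks that permuting the inputs leaves it unchanged. The function name `generated_distribution` and the example table are assumptions made for illustration only.

```python
import random
from collections import Counter
from fractions import Fraction

def generated_distribution(oracle, N):
    """Distribution generated by an oracle table: p_i = #{s : O(s) = i} / S."""
    S = len(oracle)
    counts = Counter(oracle)
    return [Fraction(counts.get(i, 0), S) for i in range(1, N + 1)]

oracle = [1, 3, 3, 2, 3, 1]             # a hypothetical map O : [6] -> [3]
p = generated_distribution(oracle, N=3)

# Composing O with a permutation of [S] only reorders the table.
shuffled = oracle[:]
random.Random(1).shuffle(shuffled)
p_shuffled = generated_distribution(shuffled, N=3)
```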

For any oracle $O : [S] \to [N]$ we shall define a quantum oracle $\hat{O}$ by transforming $O$ into a
reversible form and allowing it to accept coherent superpositions of queries. Specifically, a quantum
oracle $\hat{O}$ is a unitary operator acting on a Hilbert space $\mathbb{C}^S \otimes \mathbb{C}^{N+1}$ equipped with a standard basis
$\{|s\rangle \otimes |i\rangle\}$, $s \in [S]$, $i \in \{0\} \cup [N]$ such that

$$\hat{O} \, |s\rangle \otimes |0\rangle = |s\rangle \otimes |O(s)\rangle \quad \text{for all } s \in [S]. \qquad (1)$$

In other words, querying $\hat{O}$ on a basis vector $|s\rangle \otimes |0\rangle$ one gets the output of the classical oracle $O(s)$
in the second register while the first register keeps a copy of $s$ to maintain unitarity. The action
of $\hat{O}$ on a subspace in which the second register is orthogonal to the state $|0\rangle$ can be arbitrary.
We shall assume that a quantum tester can execute operators $\hat{O}$, $\hat{O}^{-1}$ and the controlled versions of
them. Execution of any one of these operators counts as one query.
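
One concrete way to complete Eq. (1) to a unitary is sketched below. Since the text leaves the action outside $|0\rangle$ arbitrary, the modular-addition rule used here is only an assumption chosen for illustration, and the helper name `reversible_oracle` is hypothetical. The map acts as a permutation of basis states, so the corresponding matrix is unitary.

```python
def reversible_oracle(O, N):
    """Action on basis states (s, j), with 0-based s and j in 0..N:
    (s, j) -> (s, (j + O[s]) mod (N + 1)).
    For j = 0 this reproduces |s>|0> -> |s>|O(s)>."""
    d = N + 1
    return {(s, j): (s, (j + O[s]) % d)
            for s in range(len(O)) for j in range(d)}

O = [1, 3, 3, 2]                        # hypothetical classical oracle, S = 4, N = 3
perm = reversible_oracle(O, N=3)

# A bijection on basis states corresponds to a permutation matrix (unitary).
is_bijection = sorted(perm.values()) == sorted(perm.keys())
```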

We shall see that all testing problems posed in Section 1 can be reduced (via classical randomized 
reductions) to the following problem. 

Problem 4 (Probability Estimation). Given integers $S$, $N$, a description of a subset $A \subseteq [N]$,
precision $\delta$, error probability $\omega$, and access to an oracle generating some distribution $p \in \mathcal{D}_N$. Let
$p_A = \sum_{i \in A} p_i$ be the total probability of $A$. One needs to generate an estimate $\tilde{p}_A$ satisfying

$$\Pr[\, |\tilde{p}_A - p_A| \le \delta \,] \ge 1 - \omega. \qquad (2)$$

Our main technical tool will be the quantum counting algorithm by Brassard et al. [11]. Specif- 
ically, we shall use the following version of Theorem 12 from [11]. 

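Theorem 5. There exists a quantum algorithm EstProb(p, A, M) taking as input a distribution
$p \in \mathcal{D}_N$ specified by an oracle, a subset $A \subseteq [N]$, and an integer $M$. The algorithm makes exactly
$M$ queries to the oracle generating $p$ and outputs an estimate $\tilde{p}_A$ such that

$$\Pr[\, |\tilde{p}_A - p_A| \le \delta \,] \ge 1 - \omega \qquad (3)$$

for all $\delta > 0$ and $0 \le \omega \le 1/2$ satisfying

$$M \ge \frac{c \sqrt{p_A}}{\omega \delta}, \qquad M \ge \frac{c}{\omega \sqrt{\delta}}. \qquad (4)$$

Here $c = O(1)$ is some constant. If $p_A = 0$ then $\tilde{p}_A = 0$ with certainty.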

Proof. Let $O : [S] \to [N]$ be the oracle generating $p$. Using one query to $\hat{O}$ and one query to
$\hat{O}^{-1}$ one can implement a phase-flip oracle $W_A : \mathbb{C}^S \to \mathbb{C}^S$ such that

$$W_A |s\rangle = \begin{cases} -|s\rangle & \text{if } O(s) \in A, \\ |s\rangle & \text{if } O(s) \notin A. \end{cases}$$

Theorem 12 from [11] implies that for any integer $M' \ge 1$ there exists a quantum algorithm using
an operator $\Lambda(W_A)$ exactly $M'$ times that outputs an estimate $\tilde{p}_A$ ($0 \le \tilde{p}_A \le 1$) satisfying

$$\Pr\left[ |\tilde{p}_A - p_A| \le 2\pi k \frac{\sqrt{p_A (1 - p_A)}}{M'} + k^2 \frac{\pi^2}{M'^2} \right] \ge 1 - \frac{1}{2(k - 1)}$$

for all integers $k \ge 2$. Moreover, if $p_A = 0$ then $\tilde{p}_A = 0$ with certainty.

Choosing $k$ as the smallest integer such that $k \ge 1 + 1/(2\omega)$ and $M = 2M'$ we conclude that
Eq. (3) holds whenever

$$\frac{k \sqrt{p_A}}{M} \le c' \delta \quad \text{and} \quad \frac{k^2}{M^2} \le c'' \delta$$

for some constants $c'$, $c''$. This is equivalent to Eq. (4).

□
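
To get a feeling for the numbers in this proof, the following sketch evaluates the error bound and success probability quoted from Theorem 12 of [11] for given $M'$ and $k$, and picks the smallest admissible $k$ for a target error probability $\omega$. The function names and the example values are assumptions made for illustration.

```python
import math

def counting_error_bound(p_A, M_prime, k):
    """Error bound 2*pi*k*sqrt(p(1-p))/M' + (k*pi/M')^2 and the success
    probability 1 - 1/(2(k-1)), valid for integer k >= 2."""
    bound = (2 * math.pi * k * math.sqrt(p_A * (1 - p_A)) / M_prime
             + (k * math.pi / M_prime) ** 2)
    success = 1 - 1 / (2 * (k - 1))
    return bound, success

def smallest_k(omega):
    """Smallest integer k >= 2 such that k >= 1 + 1/(2*omega)."""
    return max(2, math.ceil(1 + 1 / (2 * omega)))

omega = 0.125
k = smallest_k(omega)
bound, success = counting_error_bound(p_A=0.01, M_prime=1000, k=k)
```

With this choice of $k$ the success probability is at least $1 - \omega$, and the error bound shrinks roughly as $k\sqrt{p_A}/M'$ for small $p_A$.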



3 Quantum algorithm for estimating statistical difference 

In this section we prove Theorem 1. Let $p, q \in \mathcal{D}_N$ be unknown distributions specified by oracles.
Define an auxiliary distribution $r \in \mathcal{D}_N$ such that $r_i = (p_i + q_i)/2$ for all $i \in [N]$. If we can sample
$i$ from both $p$ and $q$ then by choosing randomly between these two options we can also sample $i$
from $r$. Let $x \in [0, 1]$ be a random variable which takes value

$$x_i = \frac{|p_i - q_i|}{p_i + q_i}$$

with probability $r_i$. It is evident that

$$E(x) = \sum_{i \in [N]} r_i x_i = \frac{1}{2} \sum_{i \in [N]} |p_i - q_i| = \frac{1}{2} \|p - q\|_1. \qquad (6)$$

Thus in order to estimate the distance $\|p - q\|_1$ it suffices to estimate the expectation value $E(x)$
which can be done using the standard Monte Carlo method. Since we have to estimate $E(x)$ only
with a constant precision, it suffices to generate $O(1)$ samples of $x_i$. Given a sample of $i$ (which is
easy to generate classically) we can estimate $x_i$ by calling the probability estimation algorithm to
get estimates of $p_i$ and $q_i$. It suggests the following algorithm for estimating the distance $\|p - q\|_1$.

EstDist$(p, q, \epsilon, \tau)$

Set $n = 27/(\tau \epsilon^2)$, $M = c \sqrt{N} / (\epsilon^6 \tau^4)$.

Let $i_1, \ldots, i_n \in [N]$ be a list of $n$ independent samples drawn from $r$.
For $a = 1, \ldots, n$

{

Let $\tilde{p}_{i_a}$ be the estimate of $p_{i_a}$ obtained using EstProb$(p, \{i_a\}, M)$.
Let $\tilde{q}_{i_a}$ be the estimate of $q_{i_a}$ obtained using EstProb$(q, \{i_a\}, M)$.
Let $\tilde{x}_{i_a} = |\tilde{p}_{i_a} - \tilde{q}_{i_a}| / (\tilde{p}_{i_a} + \tilde{q}_{i_a})$ be the estimate of $x_{i_a}$.

}

Output $\tilde{x} = (1/n) \sum_{a=1}^{n} \tilde{x}_{i_a}$.

Here $c = O(1)$ is a constant whose precise value will not be important for us.
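
The structure of EstDist can be mimicked classically. In the sketch below the quantum subroutine EstProb is replaced by the exact probabilities perturbed by a bounded relative error, which is purely an assumption for illustration; the function name `est_dist_classical`, the `noise` parameter and the example distributions are likewise hypothetical.

```python
import random

def est_dist_classical(p, q, n, rng, noise=0.0):
    """Monte Carlo estimate of E(x) = (1/2)||p - q||_1.

    Each estimate of p_i, q_i is the exact value shifted by at most
    noise * (p_i + q_i); this stands in for the output of EstProb."""
    N = len(p)
    r = [(p[i] + q[i]) / 2 for i in range(N)]
    total = 0.0
    for i in rng.choices(range(N), weights=r, k=n):
        scale = p[i] + q[i]             # positive, since i was drawn from r
        p_est = max(p[i] + rng.uniform(-noise, noise) * scale, 0.0)
        q_est = max(q[i] + rng.uniform(-noise, noise) * scale, 0.0)
        total += abs(p_est - q_est) / (p_est + q_est)
    return total / n

p = [0.5, 0.3, 0.2, 0.0]
q = [0.1, 0.3, 0.2, 0.4]
half_l1 = 0.5 * sum(abs(a - b) for a, b in zip(p, q))
estimate = est_dist_classical(p, q, n=20_000, rng=random.Random(0), noise=0.01)
```

The estimate should land close to `half_l1`, with a deviation controlled by the sample size and by the perturbation of the individual probabilities.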
Lemma 1. The algorithm EstDist$(p, q, \epsilon, \tau)$ outputs an estimate $\tilde{x}$ satisfying

$$\Pr[\, |\tilde{x} - E(x)| \le \epsilon \,] \ge 1 - \tau, \qquad (7)$$

where $E(x) = (1/2) \|p - q\|_1$.

Proof. Define a random variable

$$\bar{x} = \frac{1}{n} \sum_{a=1}^{n} x_{i_a},$$

where $i_1, \ldots, i_n$ is the list of samples generated at the first step of the algorithm. Note that
$E(\bar{x}) = E(x)$ and $\mathrm{Var}(\bar{x}) = \mathrm{Var}(x)/n$. As $|p_i - q_i| \le p_i + q_i$ we have $0 \le x_i \le 1$ and so one can bound the
variance of $x$ as $\mathrm{Var}(x) \le E(x^2) \le 1$. Therefore $\mathrm{Var}(\bar{x}) \le 1/n$. Applying the Chebyshev inequality
to $\bar{x}$ one gets

$$\Pr[\, |\bar{x} - E(x)| \ge \epsilon/3 \,] \le \frac{9 \, \mathrm{Var}(\bar{x})}{\epsilon^2} \le \frac{9}{n \epsilon^2} = \frac{\tau}{3}. \qquad (8)$$

Let $\tilde{x}$ be the output of EstDist$(p, q, \epsilon, \tau)$. The union bound implies that

$$\Pr[\, |\tilde{x} - \bar{x}| \ge \epsilon/3 \,] \le \Pr[\, \exists a : |\tilde{x}_{i_a} - x_{i_a}| \ge \epsilon/3n \,] \le n \Pr[\, |\tilde{x}_i - x_i| \ge \epsilon/3n \,], \qquad (9)$$

where $i = i_a$ is a sample drawn from $r$. Therefore it suffices to verify that

$$\Pr[\, |\tilde{x}_i - x_i| \ge \epsilon/3n \,] \le \frac{2\tau}{3n}. \qquad (10)$$

Let us say that an element $i$ is bad iff

$$\max(p_i, q_i) \le \frac{\tau}{3nN} \quad \text{(bad element)}. \qquad (11)$$

The probability that $i$ is bad is at most

$$P_{bad} = \sum_{i \text{ is bad}} r_i \le \frac{\tau}{3n}.$$

Therefore it suffices to get a bound

$$\Pr[\, |\tilde{x}_i - x_i| \ge \epsilon/3n \mid i \text{ is good} \,] \le \frac{\tau}{3n}, \qquad (12)$$

where we conditioned on $i$ being a good (not bad) element.

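Let us translate the precision up to which one needs to estimate $x_i$ into a precision up to which
one needs to estimate $p_i$ and $q_i$.

Proposition 1. Consider a real-valued function $f(p, q) = (p - q)/(p + q)$ where $0 \le p, q \le 1$.
Assume that $|\tilde{p} - p|, |\tilde{q} - q| \le \delta (p + q)$ for some $\delta \le 1/5$. Then

$$|f(\tilde{p}, \tilde{q}) - f(p, q)| \le 5\delta. \qquad (13)$$

Proof. Assume without loss of generality that $p \ge q$. Computing the partial derivatives of $f(p, q)$
one gets

$$\partial_p f(p, q) = \frac{2q}{(p + q)^2}, \qquad \partial_q f(p, q) = -\frac{2p}{(p + q)^2},$$

both of which have absolute value at most $2/(p + q)$. It follows that

$$|f(\tilde{p}, \tilde{q}) - f(p, q)| \le \frac{2}{\min\{p + q, \tilde{p} + \tilde{q}\}} \left( |\tilde{p} - p| + |\tilde{q} - q| \right).$$

The condition of the proposition implies that $\tilde{p} + \tilde{q} \ge (p + q)(1 - \delta)$, so that

$$|f(\tilde{p}, \tilde{q}) - f(p, q)| \le \frac{4\delta}{1 - \delta} \le 5\delta.$$

□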
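
Proposition 1 can be spot-checked numerically. The following sketch samples random points and perturbations within the allowed range and records the largest observed ratio $|f(\tilde{p}, \tilde{q}) - f(p, q)| / \delta$; the function names, the sampling range and the number of trials are assumptions made for illustration, and a random search of this kind is of course not a proof.

```python
import random

def f(p, q):
    return (p - q) / (p + q)

def worst_ratio(trials, delta, rng):
    """Largest observed |f(p~, q~) - f(p, q)| / delta over random perturbations
    with |p~ - p|, |q~ - q| <= delta * (p + q)."""
    worst = 0.0
    for _ in range(trials):
        p, q = rng.uniform(0.01, 1), rng.uniform(0.01, 1)
        s = p + q
        p_t = p + rng.uniform(-delta, delta) * s
        q_t = q + rng.uniform(-delta, delta) * s
        worst = max(worst, abs(f(p_t, q_t) - f(p, q)) / delta)
    return worst

ratio = worst_ratio(trials=100_000, delta=0.2, rng=random.Random(0))
```

The observed ratio stays below the constant 5 appearing in Eq. (13).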
Note that

$$|\tilde{x}_i - x_i| = \big|\, |f(\tilde{p}_i, \tilde{q}_i)| - |f(p_i, q_i)| \,\big| \le |f(\tilde{p}_i, \tilde{q}_i) - f(p_i, q_i)|.$$

Since we want to estimate $x_i$ with a precision $\epsilon/3n$, it suffices to estimate $p_i$ and $q_i$ with a precision
$\delta (p_i + q_i) \ge \delta \max(p_i, q_i)$ where $5\delta = \epsilon/3n$, that is, $\delta = \epsilon/(15n)$. Summarizing,

$$|\tilde{p}_i - p_i|, |\tilde{q}_i - q_i| \le \frac{\epsilon}{15n} \max(p_i, q_i) \;\Longrightarrow\; |\tilde{x}_i - x_i| \le \frac{\epsilon}{3n}. \qquad (14)$$

Thus it suffices to estimate $p_i$ and $q_i$ with precision

$$\delta \sim \epsilon n^{-1} \max(p_i, q_i) \sim \tau \epsilon^3 \max(p_i, q_i). \qquad (15)$$

We are going to get these estimates by calling EstProb$(p, \{i\}, M)$ and EstProb$(q, \{i\}, M)$. The
number of queries $M$ has to be chosen sufficiently large such that conditions Eq. (4) are satisfied
for precision $\delta$ defined in Eq. (15) and error probability determined by Eq. (12), that is,

$$\omega \sim \tau n^{-1} \sim \tau^2 \epsilon^2. \qquad (16)$$

It leads to the condition

$$M \ge \Omega\!\left( \frac{1}{\tau^3 \epsilon^5 \sqrt{\max(p_i, q_i)}} \right). \qquad (17)$$

Recall that we are interested in the case when $i$ is good. In this case $\max(p_i, q_i) \ge \tau/(3nN) \sim
N^{-1} \tau^2 \epsilon^2$. Therefore Eq. (17) is satisfied whenever

$$M \ge \Omega\!\left( \frac{\sqrt{N}}{\tau^4 \epsilon^6} \right).$$

□

Theorem 1 follows directly from Lemma 1 since EstDist$(p, q, \epsilon, \tau)$ makes $O(\sqrt{N})$ queries to the
quantum oracles generating $p$ and $q$.



4 Quantum algorithm for testing Uniformity 

In this section we prove Theorem 2. Let $p \in \mathcal{D}_N$ be an unknown distribution specified by an
oracle. We are promised that either $p$ is the uniform distribution, or $p$ is $\epsilon$-nonuniform, that is, the
$L_1$-distance between $p$ and the uniform distribution is at least $\epsilon$. The algorithm described below is
based on the following simple observation. Choose some integer $M \ll N$ and let $S = (i_1, \ldots, i_M)$
be a list of $M$ independent samples drawn from the distribution $p$. Define a random variable
$p_S = \sum_{a=1}^{M} p_{i_a}$. It coincides with the total probability of all elements in $S$ unless $S$ contains a
collision (that is, $i_a = i_b$ for some $a \ne b$). The characteristic property of the uniform distribution
is that $p_S = M/N$ with certainty. On the other hand, we shall see that for any $\epsilon$-nonuniform
distribution $p_S$ takes values greater than $(1 + \delta) M/N$ for some constant $\delta > 0$ depending on $\epsilon$
with a non-negligible probability. This observation suggests the following algorithm for testing
uniformity (the constants $K$ and $M$ below will be chosen later).

UTest$(p, K, M, \epsilon)$

Let $S = (i_1, \ldots, i_M)$ be a list of $M$ independent samples drawn from $p$.

Reject unless all elements in $S$ are distinct.

Let $p_S = \sum_{a=1}^{M} p_{i_a}$ be the total probability of elements in $S$.

Let $\tilde{p}_S$ be an estimate of $p_S$ obtained using EstProb$(p, S, K)$.

If $\tilde{p}_S > (1 + \epsilon^2/8) M/N$ then reject. Otherwise accept.



This procedure will need to be repeated several times to achieve the desired bound on the error 
probability, see the proof of Theorem 2 below. 
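
The decision rule of a single UTest can be sketched classically. Here the exact value of $p_S$ replaces the estimate returned by EstProb, which is an assumption made for illustration, as are the function name `utest_classical`, the 0-based indexing and the example distributions.

```python
def utest_classical(samples, p, eps):
    """Classical stand-in for UTest: the exact p_S replaces the EstProb
    estimate. `samples` holds 0-based indices into the list p."""
    N, M = len(p), len(samples)
    if len(set(samples)) < M:           # the sample list contains a collision
        return "reject"
    p_S = sum(p[i] for i in samples)
    threshold = (1 + eps ** 2 / 8) * M / N
    return "reject" if p_S > threshold else "accept"

N = 100
uniform = [1 / N] * N
skewed = [2 / N] * (N // 2) + [0.0] * (N // 2)   # L1-distance 1 from uniform

r_uniform = utest_classical([0, 5, 9], uniform, eps=0.5)
r_skewed = utest_classical([0, 5, 9], skewed, eps=0.5)
r_collision = utest_classical([0, 0, 9], uniform, eps=0.5)
```

For the uniform distribution $p_S = M/N$ sits below the threshold, while samples from the heavy half of the skewed distribution push $p_S$ to $2M/N$.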

The main technical result of this section is the following lemma. 

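Lemma 2. Let $p \in \mathcal{D}_N$ be an $\epsilon$-nonuniform distribution. Let $S = (i_1, \ldots, i_M)$ be a list of $M$
independent samples drawn from $p$, where

$$M^3 = \frac{32 N}{\epsilon^4}. \qquad (18)$$

Let $p_S = \sum_{a=1}^{M} p_{i_a}$ and $\alpha = 2^8 \epsilon^{-4}$. Then

$$\Pr\left[ p_S \ge (1 + \epsilon^2/2) \frac{M}{N} \right] \ge \frac{1}{2} \exp(-\alpha). \qquad (19)$$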



Theorem 2 follows straightforwardly from the above lemma and Theorem 5.
Proof of Theorem 2. Let $M$ be chosen as in Eq. (18) and

where $c = O(1)$ is a constant to be chosen later. Consider the following algorithm:



Perform $L = 4 \exp(\alpha)$ independent tests UTest$(p, K, M, \epsilon)$. If at least one of the tests outputs 'reject' then
reject. Otherwise accept.

Let us show that this algorithm rejects any $\epsilon$-nonuniform distribution with probability at least
2/3 and accepts the uniform distribution with probability at least 2/3.

Part 1: Any $\epsilon$-nonuniform distribution is rejected with high probability. Let $P_s$ be the probability
that for at least one of the UTests one has

$$p_S \ge (1 + \epsilon^2/2) \frac{M}{N}. \qquad (20)$$

Using Lemma 2 we conclude that

$$P_s \ge 1 - \left( 1 - \frac{1}{2} e^{-\alpha} \right)^L \ge 1 - e^{-2} \ge \frac{5}{6}. \qquad (21)$$

In what follows we shall focus on a single test UTest$(p, K, M, \epsilon)$ that satisfies Eq. (20) and show
that it outputs 'reject' with high probability. Indeed, let $S$ be the sample list generated by this
UTest. If $S$ contains a collision, the test outputs 'reject'. Otherwise $p_S$ coincides with the total
probability of all elements in $S$. The test outputs 'reject' whenever $p_S$ is estimated with a precision

$$\delta = \frac{\epsilon^2}{4} p_S. \qquad (22)$$

In this case

$$\tilde{p}_S \ge \left( 1 - \frac{\epsilon^2}{4} \right) p_S \ge \left( 1 - \frac{\epsilon^2}{4} \right) \left( 1 + \frac{\epsilon^2}{2} \right) \frac{M}{N} \ge \left( 1 + \frac{\epsilon^2}{8} \right) \frac{M}{N}.$$

(Here we assumed for simplicity that $\epsilon \le 1$.) Suppose we want the UTest to output 'reject' with
probability at least 5/6. Applying Eq. (4) with $\delta$ defined in Eq. (22) and $\omega = 1/6$ we arrive at

$$K \ge \frac{c}{\epsilon^2 \sqrt{p_S}} \qquad (23)$$

for some constant $c = O(1)$. Using Eq. (20) it suffices to choose

$$K = \Omega\!\left( \frac{1}{\epsilon^2} \sqrt{\frac{N}{M}} \right) = \Omega\!\left( \frac{N^{1/3}}{\epsilon^{4/3}} \right). \qquad (24)$$

Summarizing, if $p$ is an $\epsilon$-nonuniform distribution it will be rejected with probability at least
$(5/6)^2 > 2/3$.

Part 2: The uniform distribution is accepted with high probability. Note that the uniform
distribution can be rejected for two possible reasons: (i) for some UTest the sample list $S$ contains
a collision; (ii) for some UTest the estimate $\tilde{p}_S$ is sufficiently large, $\tilde{p}_S > (1 + \epsilon^2/8) M/N$. We
analyze these two possible sources of errors below.

(i) For any fixed UTest let $S = (i_1, \ldots, i_M)$ be a list of $M$ samples drawn from $p$. Let $C$ be the
number of collisions in $S$, that is, the number of pairs $1 \le a < b \le M$ such that $i_a = i_b$. Then,

$$E(C) = \binom{M}{2} \sum_{i=1}^{N} p_i^2 \le \frac{M^2}{2N}.$$

Markov's inequality implies that $\Pr[C \ge 1] \le E(C) \le M^2/(2N)$. Then the probability that at
least one of the UTests will find a collision can be bounded using the union bound as

$$\frac{L M^2}{2N} = O\!\left( \frac{1}{N^{1/3}} \right)$$

since we have chosen $M = O(N^{1/3})$ and $L = O(1)$. Thus the error probability associated with
finding collisions can be neglected.
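
The collision count in step (i) is easy to evaluate directly. The sketch below computes $E(C) = \binom{M}{2} \sum_i p_i^2$ for the uniform distribution and compares it with the bound $M^2/(2N)$; the function name and the values of $N$ and $M$ are assumptions chosen for illustration.

```python
from math import comb

def expected_collisions(p, M):
    """E(C) = C(M, 2) * sum_i p_i^2 for M independent samples from p."""
    return comb(M, 2) * sum(x * x for x in p)

N, M = 10_000, 20
uniform = [1 / N] * N
e_c = expected_collisions(uniform, M)   # equals M(M-1)/(2N) for uniform p
bound = M ** 2 / (2 * N)
```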

(ii) Let $\tilde{p}_S$ be the estimate of $p_S$ obtained in some fixed UTest. Since $p_S = M/N$ with
certainty, the test outputs 'accept' whenever the estimate $\tilde{p}_S$ returned by EstProb$(p, S, K)$ satisfies
$|\tilde{p}_S - p_S| \le \delta$, where

$$\delta = \frac{\epsilon^2 M}{8N}. \qquad (25)$$

Since the total number of UTests is $L = 4 e^{\alpha}$, we would like the estimate $\tilde{p}_S$ to have precision $\delta$ with
error probability $\omega \le \frac{1}{12} e^{-\alpha}$. Applying Eq. (4) with $\delta$, $\omega$ defined above and taking into account
that $p_S = M/N$, we find that we can take the number of queries $K$ to be

$$K = O\!\left( \frac{e^{\alpha}}{\epsilon^2} \sqrt{\frac{N}{M}} \right). \qquad (26)$$

It remains to choose the largest of Eq. (24) and Eq. (26). □

In the rest of this section we prove Lemma 2. We shall adopt notations introduced in the
statement of Lemma 2, that is, the number of samples $M$ is defined by

$$M^3 = 32 \epsilon^{-4} N,$$

$\alpha = 2^8 \epsilon^{-4}$, $S = (i_1, \ldots, i_M)$ is a list of $M$ independent samples drawn from $p$, and $p_S = \sum_{a=1}^{M} p_{i_a}$.

Definition 1. An element $i \in [N]$ is called big iff $p_i \ge 1/(2M^2)$.

Define the set $\mathrm{Big} \subseteq [N]$ of all big elements and their total probability:

$$\mathrm{Big} = \{ i \in [N] : p_i \ge 1/(2M^2) \}, \qquad w_{big} = \sum_{i \in \mathrm{Big}} p_i. \qquad (27)$$

We shall start in subsection 4.1 by proving Lemma 2 for the special case when $p$ has no
big elements. The proof is based on Chebyshev's inequality. Then we shall leverage this result in
subsection 4.2 to show that distributions with a few big elements (small $w_{big}$) also satisfy Lemma 2.
Finally in subsection 4.3, we shall treat distributions with many big elements (large $w_{big}$) using a
completely different technique.

4.1 Proof of Lemma 2: no big elements 

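Lemma 3 (No big elements). Suppose $p \in \mathcal{D}_N$ is $\epsilon$-nonuniform and has no big elements. Then

$$\Pr\left[ p_S \ge \left( 1 + \frac{\epsilon^2}{2} \right) \frac{M}{N} \right] \ge \frac{3}{4}. \qquad (28)$$

Proof. One can easily check that

$$E(p_S) = M (p|p), \qquad \mathrm{Var}(p_S) = M \left( \sum_{i=1}^{N} p_i^3 - (p|p)^2 \right), \qquad (29)$$

where $(p|p) = \sum_{i=1}^{N} p_i^2$.

Proposition 2. Suppose $p \in \mathcal{D}_N$ is $\epsilon$-nonuniform. Then

$$(p|p) \ge \frac{1 + \epsilon^2}{N}. \qquad (30)$$

Proof. Let $u$ be the uniform distribution. Then $\epsilon \le \|p - u\|_1 \le \sqrt{N} \|p - u\|_2 = \sqrt{N} \sqrt{(p|p) - N^{-1}}$,
which gives the desired bound. □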
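
Proposition 2 can be checked numerically on random distributions by taking $\epsilon$ to be the actual $L_1$-distance from uniform. The sketch below does this; the function name, the way the random distributions are drawn and the small floating-point tolerance are assumptions made for illustration.

```python
import random

def check_proposition_2(N, trials, rng):
    """Check (p|p) >= (1 + eps^2)/N with eps = ||p - u||_1 on random p."""
    for _ in range(trials):
        w = [rng.random() for _ in range(N)]
        total = sum(w)
        p = [x / total for x in w]
        eps = sum(abs(x - 1 / N) for x in p)
        if sum(x * x for x in p) < (1 + eps ** 2) / N - 1e-12:
            return False
    return True

ok = check_proposition_2(N=50, trials=1000, rng=random.Random(0))
```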
Using the proposition and the assumption that $p$ has no big elements we get

$$E(p_S) \ge \frac{M}{N} (1 + \epsilon^2), \qquad \mathrm{Var}(p_S) \le M \|p\|_\infty (p|p) \le \frac{1}{2M} (p|p). \qquad (31)$$

Chebyshev's inequality implies that

$$\Pr[\, |p_S - E(p_S)| \ge t E(p_S) \,] \le \frac{\mathrm{Var}(p_S)}{E(p_S)^2 t^2}. \qquad (32)$$

Assuming for simplicity that $\epsilon^2 \le 1/3$ we can use the bound $(1 + \epsilon^2)^{-1} \le 1 - 3\epsilon^2/4$ and thus

$$\Pr\left[ p_S < \left( 1 + \frac{\epsilon^2}{2} \right) \frac{M}{N} \right] \le \Pr\left[ p_S < E(p_S) \frac{1 + \epsilon^2/2}{1 + \epsilon^2} \right] \le \Pr\left[ p_S < (1 - \epsilon^2/4) E(p_S) \right].$$

Using Eq. (32) with $t = \epsilon^2/4$ and Eqs. (29,31) we arrive at

$$\Pr\left[ p_S < (M/N)(1 + \epsilon^2/2) \right] \le \frac{(p|p)}{2M} \cdot \frac{1}{M^2 (p|p)^2 t^2} \le \frac{8N}{M^3 \epsilon^4} \le \frac{1}{4}$$

since $(p|p) \ge N^{-1}$ for any distribution $p \in \mathcal{D}_N$ and since we have chosen $M^3 = 32 \epsilon^{-4} N$.

□



4.2 Proof of Lemma 2: a few big elements 

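Lemma 4 (A few big elements). Suppose $p \in \mathcal{D}_N$ is $\epsilon$-nonuniform and has only a few big
elements such that

$$w_{big} \le \frac{\alpha}{M}, \qquad \alpha = 2^8 \epsilon^{-4}. \qquad (33)$$

Then

$$\Pr\left[ p_S \ge (1 + \epsilon^2/2) \frac{M}{N} \right] \ge \frac{1}{2} \exp(-\alpha). \qquad (34)$$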

Proof. Let $S = (i_1, \ldots, i_M)$ be a list of $M$ samples drawn from $p$. We can get a constant lower
bound on the probability that $S$ contains no big elements:

$$\Pr[S \cap \mathrm{Big} = \emptyset] = (1 - w_{big})^M \approx \exp(-M w_{big}) \ge e^{-\alpha}. \qquad (35)$$

(Strictly speaking, one gets a lower bound $e^{-\alpha}(1 - o(1))$.) It suffices to show that $p_S \ge (1 +
\epsilon^2/2) M/N$ with probability at least 1/2 conditioned on $S$ having no big elements.

The conditional distribution of the random variable $p_S$ given that $S$ contains no big elements
can be obtained by setting the probability of all big elements to zero and renormalizing $p$ by a
factor $(1 - w_{big})^{-1}$. In other words, we can repeat all arguments of Lemma 3 if we replace $p$ by a
new distribution $p' \in \mathcal{D}_N$ such that

$$p'_i = \begin{cases} \dfrac{p_i}{1 - w_{big}} & \text{if } i \notin \mathrm{Big}, \\ 0 & \text{if } i \in \mathrm{Big}. \end{cases} \qquad (36)$$

We have to check that $p'$ is also $\epsilon$-nonuniform.

Proposition 3. The distribution $p'$ is $\epsilon'$-nonuniform, where $\epsilon' \ge \epsilon - O(N^{-1/3})$.

Proof.

$$\|p - p'\|_1 = \sum_{i \in \mathrm{Big}} p_i + \sum_{i \notin \mathrm{Big}} \left( (1 - w_{big})^{-1} - 1 \right) p_i \le w_{big} + \frac{w_{big}}{1 - w_{big}} = O(N^{-1/3}).$$

Let $u$ be the uniform distribution. Using the triangle inequality we get

$$\|p' - u\|_1 \ge \|p - u\|_1 - \|p - p'\|_1 \ge \epsilon - O(N^{-1/3}).$$

□

To simplify notations we shall neglect the correction of order $N^{-1/3}$ and assume that $p'$ is
$\epsilon$-nonuniform. By construction,

$$\|p'\|_\infty \le \frac{1}{(1 - w_{big}) 2M^2} = \frac{1}{2M^2} + O(N^{-1}).$$

Neglecting the correction of order $N^{-1}$ we can assume that $p'$ has no big elements. Then Lemma 3
implies that $p'_S \ge (1 + \epsilon^2/2) M/N$ with probability at least 3/4. Combining it with Eq. (35) we
arrive at Eq. (34). □



4.3 Proof of Lemma 2: many big elements 

Lemma 5 (Many big elements). Suppose $p$ is $\epsilon$-nonuniform and has many big elements such
that

$$w_{big} > \frac{\alpha}{M}, \qquad \alpha = 2^8 \epsilon^{-4}. \qquad (37)$$

Then

$$\Pr\left[ p_S \ge \frac{2M}{N} \right] \ge \frac{1}{2}. \qquad (38)$$

Proof. Let $S = (i_1, \ldots, i_M)$ be a list of $M$ independent samples drawn from $p$. Since each big
element contained in $S$ contributes at least $1/(2M^2)$ to $p_S$, the inequality $p_S \ge 2M/N$ is satisfied
whenever $S$ contains at least $n$ big elements where

$$\frac{n}{2M^2} \ge \frac{2M}{N}.$$

Since $M^3 = 2^5 \epsilon^{-4} N$, we can choose

$$n = 2^7 \epsilon^{-4} = \alpha/2. \qquad (39)$$

The total number of samples $a \in [M]$ such that $i_a$ is big can be represented as $\xi = \sum_{a=1}^{M} \xi_a$ where
$\xi_a \in \{0, 1\}$ is a random variable such that $\xi_a = 1$ iff $i_a$ is a big element. Note that $E(\xi) = M w_{big} > \alpha$.
Using Chebyshev's inequality we get

$$\Pr[\xi < n] \le \Pr\left[ |\xi - E(\xi)| \ge \frac{1}{2} E(\xi) \right] \le \frac{4 \, \mathrm{Var}(\xi)}{E(\xi)^2} \le \frac{4}{E(\xi)} \le \frac{4}{\alpha} \le \frac{1}{2}. \qquad (40)$$

□

5 Quantum algorithm for testing orthogonality



Consider distributions $p, q \in \mathcal{D}_N$ and let $S = (i_1, \ldots, i_M)$ be a list of $M$ independent samples drawn
from $p$. Let $A \subseteq [N]$ be the set of all elements that appear in $S$ at least once. Define the collision
probability

$$q_A = \sum_{i \in A} q_i.$$

Note that $q_A$ is a deterministic function of $A$, so the probability distribution of $q_A$ is determined
by the probability distribution of $A$ (which depends on $p$ and $M$). For a fixed $A$ the variable $q_A$ is the
probability that a sample drawn from $q$ belongs to $A$.

Clearly if $p$ and $q$ are orthogonal then $q_A = 0$ with probability 1. On the other hand, if $p$ and $q$
have a constant overlap, we will show that $q_A$ takes values of order $M/N$ with constant probability.
Specifically, we shall prove the following lemma.

Lemma 6. Consider a pair of distributions $p, q \in \mathcal{D}_N$ such that $\|p - q\|_1 \le 2 - \epsilon$. Let $q_A$ be a collision probability constructed using $M$ samples. Suppose $M \ge 2^9 \epsilon^{-2}$. Then

$$\Pr\left[ q_A \ge \frac{\epsilon^3 M}{2^{11} N} \right] \ge \frac{1}{2}. \tag{41}$$

It suggests the following algorithm for testing orthogonality.



OTest(p, q, M, K)

Let $S = (i_1, \ldots, i_M)$ be a list of $M$ independent samples drawn from $p$.

Let $A \subseteq [N]$ be the set of elements that appear in $S$ at least once.

Let $q_A = \sum_{i \in A} q_i$ be the total probability of elements in $A$ with respect to $q$.

Let $\tilde{q}_A$ be an estimate of $q_A$ obtained using EstProb(q, A, K).

If $\tilde{q}_A \ge \frac{\epsilon^3 M}{2^{12} N}$ then reject. Otherwise accept.
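As an illustration only, the steps of OTest can be simulated classically. The sketch below does not implement the quantum subroutine EstProb; it replaces it with a plain $K$-sample Monte Carlo estimate, so it does not reproduce the quantum query complexity. The function name `otest_classical`, the explicit probability vectors `p` and `q`, and the `rng` argument are assumptions of this sketch. The rejection threshold $\epsilon^3 M/(2^{12} N)$ is the one used in the proof of Lemma 7.

```python
import random

def otest_classical(p, q, M, K, eps, rng):
    """Classical stand-in for OTest; EstProb is replaced by a K-sample frequency estimate."""
    N = len(p)
    elements = range(N)
    # Draw M independent samples from p and keep the distinct elements (the set A).
    A = set(rng.choices(elements, weights=p, k=M))
    # Hypothetical replacement for EstProb(q, A, K): fraction of K samples from q that land in A.
    hits = sum(1 for i in rng.choices(elements, weights=q, k=K) if i in A)
    q_A_estimate = hits / K
    # Threshold eps^3 * M / (2^12 * N), as in the proof of Lemma 7.
    threshold = eps**3 * M / (2**12 * N)
    return "reject" if q_A_estimate >= threshold else "accept"

if __name__ == "__main__":
    N = 64
    p = [2 / N] * (N // 2) + [0.0] * (N // 2)   # uniform on the first half
    q = [0.0] * (N // 2) + [2 / N] * (N // 2)   # uniform on the second half
    print(otest_classical(p, q, M=512, K=200, eps=1.0, rng=random.Random(0)))
```

For orthogonal inputs no sample from $q$ can land in $A$, so the frequency estimate is exactly zero and the sketch always accepts, mirroring the one-sided error of OTest.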



We note that if $q_A = 0$ then $\tilde{q}_A = 0$ with certainty (see Theorem 5) and so OTest accepts any pair of orthogonal distributions with certainty. Theorem 3 is a direct consequence of the following lemma.

Lemma 7. Choose 

(42) 




Then OTest(p, q, M, K) rejects any distributions $p, q \in \mathcal{D}_N$ such that $\|p - q\|_1 \le 2 - \epsilon$ with probability at least 1/4.

Proof. According to Eq. (41), $q_A \ge \epsilon^3 M/(2^{11} N)$ with probability at least 1/2. When this holds, the algorithm rejects whenever

$$|\tilde{q}_A - q_A| \le \frac{q_A}{2},$$

since this implies $\tilde{q}_A \ge q_A/2 \ge \epsilon^3 M/(2^{12} N)$. Applying Theorem 5 with precision $\delta = q_A/2$ and error probability $\omega = 1/2$, we find (according to Eq. (4)) that $K$ should be

$$K \ge \Omega\left( \frac{1}{\sqrt{q_A}} \right). \tag{43}$$

Taking into account Eq. (41) it suffices to choose

$$K = \Omega\left( \frac{N^{1/2}}{\epsilon^{3/2} M^{1/2}} \right)$$

to guarantee that OTest outputs 'reject' with probability at least $(1/2) \cdot (1/2) = 1/4$. Minimizing the total number of queries $K + M$ we arrive at Eq. (42). □
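To make the last step concrete, here is a small numerical sketch of the minimization of $K + M$ when $K = c\,N^{1/2}/(\epsilon^{3/2} M^{1/2})$. The constant `c` stands in for the unspecified constant hidden in the $\Omega$ notation and is purely an assumption, as are the function names. The sketch only indicates the scaling of the minimizer, $M \propto N^{1/3}/\epsilon$, and ignores the side condition $M \ge 2^9 \epsilon^{-2}$ from Lemma 6.

```python
def total_queries(M, N, eps, c=1.0):
    """M + K(M), with K(M) = c * N^(1/2) / (eps^(3/2) * M^(1/2))."""
    K = c * N**0.5 / (eps**1.5 * M**0.5)
    return M + K

def best_M(N, eps, c=1.0):
    """Stationary point of total_queries in M.

    Setting the derivative 1 - (c/2) * N^(1/2) * eps^(-3/2) * M^(-3/2) to zero
    gives M = (c/2)^(2/3) * N^(1/3) / eps.
    """
    return (c / 2) ** (2 / 3) * N ** (1 / 3) / eps

if __name__ == "__main__":
    N, eps = 10**6, 0.5
    M = best_M(N, eps)
    print(M, total_queries(M, N, eps))
```

At the stationary point both $M$ and $K$ scale as $N^{1/3}/\epsilon$, which is the sense in which the two terms are balanced.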

In the rest of this section we prove Lemma 6. 

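Proof. Begin by defining two sets of indices:

$$B = \left\{ i : q_i < \frac{\epsilon}{4} p_i \right\}, \tag{44}$$

$$C = \left\{ i : p_i < \frac{\epsilon}{32} N^{-1} \right\}. \tag{45}$$

Let $B^c$, $C^c$ denote the complements of $B$ and $C$ respectively. We will prove that

$$\Pr\left[ |A \cap B^c \cap C^c| \ge \frac{\epsilon}{16} M \right] \ge 1/2, \tag{46}$$

which will imply the Lemma since

$$q_A \ge \sum_{i \in A \cap B^c \cap C^c} q_i \ge \frac{\epsilon}{4} \sum_{i \in A \cap B^c \cap C^c} p_i \ge \frac{\epsilon^2}{2^7 N}\, |A \cap B^c \cap C^c|. \tag{47}$$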



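First, we show that $|A \cap B|$ is likely to not be too big. Observe that $q_B < \frac{\epsilon}{4} p_B \le \frac{\epsilon}{4}$. Next use the fact that $\frac{1}{2}\|p - q\|_1 = \max_{U \subseteq [N]} (p_U - q_U) \le 1 - \frac{\epsilon}{2}$ to bound $p_B \le 1 - \frac{\epsilon}{2} + \frac{\epsilon}{4} = 1 - \frac{\epsilon}{4}$. Now we state a Chernoff-Hoeffding bound.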

Lemma 8. Let $X_1, \ldots, X_M$ be independent 0, 1 random variables with $X = \sum_{i=1}^{M} X_i$. Then for any $\delta > 0$,

$$\Pr[X \ge E(X) + M\delta] \le \exp(-2M\delta^2). \tag{48}$$
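As a sanity check of Lemma 8 in the simplest case, one can compare the exact upper tail of a binomial variable (identically distributed $X_i$ with common mean `mu`) against $\exp(-2M\delta^2)$. The function names and the chosen parameter values are illustrative assumptions, not part of the original argument.

```python
from math import ceil, comb, exp

def binomial_upper_tail(M, mu, t):
    """Exact Pr[X >= t] for X ~ Binomial(M, mu) and an integer threshold t."""
    return sum(comb(M, j) * mu**j * (1 - mu) ** (M - j) for j in range(max(t, 0), M + 1))

def hoeffding_bound(M, delta):
    """Right-hand side of Eq. (48)."""
    return exp(-2 * M * delta**2)

def lemma8_holds(M, mu, delta):
    # Smallest integer t with t >= M * (mu + delta); the small offset guards against float rounding.
    t = ceil(M * (mu + delta) - 1e-9)
    return binomial_upper_tail(M, mu, t) <= hoeffding_bound(M, delta)

if __name__ == "__main__":
    for M, mu, delta in [(50, 0.1, 0.05), (200, 0.3, 0.1), (200, 0.5, 0.2)]:
        print(M, mu, delta, lemma8_holds(M, mu, delta))
```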

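Recall that $A$ consists of the unique elements of $S = (i_1, \ldots, i_M)$. For $j = 1, \ldots, M$, define $X_j = 1$ if $i_j \in B$ and $X_j = 0$ if not. Then $|A \cap B| \le \sum_{j=1}^{M} X_j$, with the possibility of a strict inequality in case there are repeats. We can now use Lemma 8 with $E(X_j) = p_B \le 1 - \epsilon/4$ and $\delta = \epsilon/8$ to prove that

$$\Pr\left[ |A \cap B| \ge \left(1 - \frac{\epsilon}{8}\right) M \right] \le \exp\left( -2M \frac{\epsilon^2}{64} \right) = \exp\left( -\frac{M \epsilon^2}{32} \right). \tag{49}$$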



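Next, we observe that $p_C < \epsilon/32$. We can use the same method to show that $|A \cap C|$ is likely to not be too big. This time we define $X_j = 1$ iff $i_j \in C$, so that $|A \cap C| \le \sum_{j=1}^{M} X_j$ and $E(X_j) = p_C < \epsilon/32$. Setting $\delta = \epsilon/32$ we get

$$\Pr\left[ |A \cap C| \ge \frac{\epsilon}{16} M \right] \le \exp\left( -\frac{M \epsilon^2}{2^9} \right). \tag{50}$$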



When $M \ge 2^9/\epsilon^2$, we can combine (49) and (50) to find that with probability at least 1/2, both $|A \cap B^c| \ge \frac{\epsilon}{8} M$ and $|A \cap C^c| \ge \left(1 - \frac{\epsilon}{16}\right) M$. Thus $|A \cap B^c \cap C^c| \ge \frac{\epsilon}{16} M$ with probability at least 1/2. This establishes (46), and completes the proof of the lemma.



□

6 Lower bounds



6.1 Sampling vs query complexity 

Let $p \in \mathcal{D}_N$ be any distribution and $O : [S] \to [N]$ be an oracle generating $p$. Recall that $p_i$ coincides with the fraction of inputs $s \in [S]$ such that $O(s) = i$. It does not matter which particular inputs $s$ are mapped to $i$. The only thing that matters is the number of such inputs. Therefore one can choose an arbitrary permutation of inputs $\sigma : [S] \to [S]$ and construct a new oracle $O' = O \circ \sigma$ that generates the same distribution $p$. We shall see below that if a classical testing algorithm $\mathcal{A}$ gives a correct answer with high probability for any choice of $S$ and $\sigma$ then $\mathcal{A}$ cannot take any advantage from making adaptive queries to $O$. Let us transform $\mathcal{A}$ into a 'sampling' algorithm $\mathcal{A}_s$ such that each query made in $\mathcal{A}$ is replaced by a random query drawn from the uniform distribution on $[S]$.

Lemma 9. Let $\mathcal{A}$ be any classical testing algorithm and $p \in \mathcal{D}_N$ be some distribution such that $\mathcal{A}$ accepts (rejects) $p$ with probability at least 2/3 for any oracle $O : [S] \to [N]$ generating $p$. Then the corresponding sampling algorithm $\mathcal{A}_s$ accepts (rejects) $p$ with probability at least 2/3.

Proof. Let $P_{acc}(\sigma)$ be the probability that $\mathcal{A}$ accepts while interacting with the oracle $O \circ \sigma$, where $\sigma$ is a permutation on $[S]$. Without loss of generality $P_{acc}(\sigma) \ge 2/3$ for all $\sigma$. It implies that the average acceptance probability

$$\bar{P}_{acc} = \frac{1}{S!} \sum_{\sigma} P_{acc}(\sigma) \ge \frac{2}{3}. \tag{51}$$

An execution of the algorithm $\mathcal{A}$ can be represented by a history of queries $Q = (s_1, \ldots, s_T) \in [S]^{\times T}$. Let $P(Q)$ be the probability that an execution of $\mathcal{A}$ leads to a history $Q$. We can assume without loss of generality that the output of $\mathcal{A}$ (accept or reject) is a deterministic function of $Q$. Let $\Omega_{acc}$ be the set of histories $Q$ that make $\mathcal{A}$ accept. We have $P_{acc}(\sigma) = \sum_{Q \in \Omega_{acc}} P(\sigma^{-1}(Q))$ where

$$\sigma^{-1}(Q) = (\sigma^{-1}(s_1), \ldots, \sigma^{-1}(s_T)),$$

and thus

$$\bar{P}_{acc} = \sum_{Q \in \Omega_{acc}} \bar{P}(Q),$$

where $\bar{P}(Q) = E(P(\sigma^{-1}(Q)))$ and $\sigma$ is drawn from the uniform distribution. Let $U(Q)$ be the uniform distribution on the set $[S]^{\times T}$. We claim that

$$\|\bar{P} - U\|_1 = O(T^2 S^{-1}). \tag{52}$$

Assume without loss of generality that all queries in $Q$ are different. Then

$$\bar{P}(Q) = \frac{(S - T)!}{S!} = S^{-T}\left(1 + O(T^2/S)\right).$$

The probability that a history drawn from the uniform distribution contains two or more equal queries can be bounded by $O(T^2/S)$ and thus we arrive at Eq. (52). Therefore in the limit $S \to \infty$ the acceptance probability is at least 2/3 if $Q$ is drawn from the uniform distribution. But this implies that the sampling algorithm $\mathcal{A}_s$ accepts $p$ with probability at least 2/3. □

6.2 Reduction from the Collision Problem to testing Orthogonality

One can get lower bounds on the query complexity of testing Orthogonality using the lower bounds for the Collision problem [3]. Indeed, let $H : [N] \to [3N/2]$ be an oracle function such that either $H$ is one-to-one (yes-instance) or $H$ is two-to-one (no-instance). The Collision Problem is to decide which one is the case. It was shown by Aaronson and Shi [3] that the quantum query complexity of the Collision problem is $\Omega(N^{1/3})$. Below we show that the Collision problem can be reduced to testing Orthogonality$^3$. It implies that testing Orthogonality requires $\Omega(N^{1/2})$ queries classically and $\Omega(N^{1/3})$ queries quantumly.

Indeed, choose a random permutation $\sigma : [N] \to [N]$ and define functions $O_p, O_q : [N/2] \to [3N/2]$ by restricting the composition $H \circ \sigma$ to the subsets of odd and even integers respectively:

$$O_p(s) = H(\sigma(2s - 1)), \qquad O_q(s) = H(\sigma(2s)), \qquad s \in [N/2].$$

For any yes-instance (i.e. $H$ is one-to-one), the distributions $p, q \in \mathcal{D}_{3N/2}$ generated by $O_p$ and $O_q$ are uniform distributions on some pair of disjoint subsets of $[3N/2]$; that is, $p$ and $q$ are orthogonal.

We need to show that for any no-instance ($H$ is two-to-one) the distance $\|p - q\|_1$ takes values smaller than $2 - \epsilon$ with a sufficiently high probability for some constant $\epsilon$.

Lemma 10. Let $H : [N] \to [3N/2]$ be any two-to-one function. Let $\sigma : [N] \to [N]$ be a random permutation drawn from the uniform distribution. Then

$$\Pr\left[ \|p - q\|_1 \le \frac{7}{4} \right] \ge \frac{1}{2}.$$



Proof. Given the promise on $H$ we can define a perfect matching $\mathcal{M}$ on the set $[N]$ (considered as a complete graph with $N$ vertices) such that $H(u) = H(v)$ iff $u$ and $v$ are matched. Let $\mathcal{M}_\sigma = \sigma^{-1} \circ \mathcal{M}$, so that $u$ and $v$ are matched in $\mathcal{M}_\sigma$ iff $\sigma(u)$ and $\sigma(v)$ are matched in $\mathcal{M}$. Clearly, $\mathcal{M}_\sigma$ is a random perfect matching on $[N]$ drawn from the uniform distribution on the set of all perfect matchings. Let $(u, v) \in \mathcal{M}_\sigma$ be some pair of matched vertices and $w = H(\sigma(u)) = H(\sigma(v))$. Note that if $u$ and $v$ have different parity then $p_w = q_w = 2/N$. On the other hand, if $u$ and $v$ have the same parity then $p_w = 4/N$, $q_w = 0$ or vice versa. Thus

$$\|p - q\|_1 = 2 - \frac{4}{N}\, \#\{(u, v) \in \mathcal{M}_\sigma : u \text{ and } v \text{ have different parity}\}. \tag{53}$$
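Eq. (53) can be checked on small instances. The sketch below builds $p$ and $q$ from a given $H$ and $\sigma$ exactly as in the reduction, using exact rational arithmetic, and compares $\|p - q\|_1$ with the count of matched pairs of different parity. The dictionary representation of $H$ and $\sigma$, the particular two-to-one function used in `check_eq53`, and all function names are assumptions made for illustration.

```python
import random
from fractions import Fraction

def induced_distributions(H, sigma):
    """Distributions generated by O_p(s) = H(sigma(2s-1)) and O_q(s) = H(sigma(2s))."""
    N = len(sigma)
    weight = Fraction(2, N)  # each of the N/2 oracle inputs s is equally likely
    p, q = {}, {}
    for s in range(1, N // 2 + 1):
        w_p, w_q = H[sigma[2 * s - 1]], H[sigma[2 * s]]
        p[w_p] = p.get(w_p, 0) + weight
        q[w_q] = q.get(w_q, 0) + weight
    return p, q

def l1_distance(p, q):
    return sum(abs(p.get(w, 0) - q.get(w, 0)) for w in set(p) | set(q))

def mixed_parity_pairs(H, sigma):
    """Number of pairs {u, v} with H(sigma(u)) == H(sigma(v)) and different parity."""
    by_value = {}
    for u in range(1, len(sigma) + 1):
        by_value.setdefault(H[sigma[u]], []).append(u)
    return sum(1 for us in by_value.values() if len(us) == 2 and (us[0] - us[1]) % 2 == 1)

def check_eq53(N, rng):
    """Compare both sides of Eq. (53) for the two-to-one map x -> ceil(x/2) and a random sigma."""
    H = {x: (x + 1) // 2 for x in range(1, N + 1)}
    perm = list(range(1, N + 1))
    rng.shuffle(perm)
    sigma = {u: perm[u - 1] for u in range(1, N + 1)}
    p, q = induced_distributions(H, sigma)
    return l1_distance(p, q) == 2 - Fraction(4, N) * mixed_parity_pairs(H, sigma)

if __name__ == "__main__":
    print(all(check_eq53(16, random.Random(seed)) for seed in range(10)))
```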

A nice property of the uniform distribution on the set of perfect matchings on $[N]$ is that the conditional distribution given that $(u, v) \in \mathcal{M}_\sigma$ is the uniform distribution on the set of perfect matchings on $[N] \setminus \{u, v\}$. Thus we can generate $\mathcal{M}_\sigma$ using the following algorithm. Let $U \subseteq [N]$ be the set of all unpaired vertices (in the beginning $U = [N]$). Let $U_{even}$ and $U_{odd}$ be the subsets of all even and all odd integers in $U$. The algorithm starts from an empty matching $\mathcal{M}_\sigma = \emptyset$. Suppose at some step of the algorithm we have some matching $\mathcal{M}_\sigma$ and some sets of unpaired vertices $U = U_{even} \cup U_{odd}$. If $|U_{even}| \ge |U_{odd}|$ choose a random vertex $u \in U_{odd}$. If $|U_{even}| < |U_{odd}|$ choose a random vertex $u \in U_{even}$. Pair $u$ with a random vertex $v \in U \setminus \{u\}$ and update

$$\mathcal{M}_\sigma \leftarrow \mathcal{M}_\sigma \cup \{(u, v)\}, \qquad U \leftarrow U \setminus \{u, v\}$$

with the corresponding update for $U_{even}$ and $U_{odd}$. After $N/2$ steps the algorithm has generated a uniformly random $\mathcal{M}_\sigma$.
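The procedure just described can be sketched as follows. The function name and the use of Python's `random.Random` are assumptions. One detail is not specified in the text: the smaller parity class may already be empty while unpaired vertices remain, and in that case this sketch falls back to choosing $u$ from the other class, which is an assumption of the sketch rather than part of the original description.

```python
import random

def random_matching(N, rng):
    """Generate a perfect matching on {1, ..., N} (N even) one pair at a time."""
    unpaired = set(range(1, N + 1))
    matching = []
    while unpaired:
        evens = sorted(x for x in unpaired if x % 2 == 0)
        odds = sorted(x for x in unpaired if x % 2 == 1)
        # Take u from the parity class that is not larger; when that class is non-empty,
        # at least half of the other unpaired vertices have the opposite parity.
        smaller = odds if len(evens) >= len(odds) else evens
        # Assumption: if the smaller class is empty, pick u from whatever is left.
        pool = smaller if smaller else (evens or odds)
        u = rng.choice(pool)
        v = rng.choice(sorted(unpaired - {u}))
        matching.append((u, v))
        unpaired -= {u, v}
    return matching

if __name__ == "__main__":
    pairs = random_matching(12, random.Random(0))
    mixed = sum(1 for u, v in pairs if (u - v) % 2 == 1)
    print(pairs, mixed)
```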



$^3$ In order to apply the lower bound proved in [3] one has to choose the range of $H$ of size $3N/2$ rather than $N$, which would be more natural.

By construction, at each step of the algorithm we pair a vertex $u$ to a vertex $v$ with the opposite parity with probability at least 1/2. Thus the probability $P(k)$ of having a matching $\mathcal{M}_\sigma$ with fewer than $k$ pairs having opposite parity is

$$P(k) \le \sum_{i=0}^{k} \binom{N/2}{i}\, 2^{-N/2 + k} \le 2^{\frac{N}{2}\left[ H(x) + x - 1 + o(1) \right]},$$

where $x = 2k/N$ and $H(x)$ here denotes the binary entropy function. One can check that $H(x) + x - 1 < 0$ for $x \le 1/8$ and thus $P(N/16) \le 1/2$ for sufficiently large $N$. Thus Eq. (53) implies that $\|p - q\|_1 \le 2 - 1/4 = 7/4$ with probability at least 1/2. □
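The two numerical facts used above are easy to check directly. The following sketch evaluates $H(x) + x - 1$ at $x = 1/8$ and the sum $\sum_{i=0}^{k} \binom{N/2}{i} 2^{-N/2+k}$ for a few small $N$ with $k = N/16$; the function names and the chosen values of $N$ are illustrative assumptions.

```python
from math import comb, log2

def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2 (1 - x), with H(0) = H(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def matching_tail_bound(N, k):
    """Evaluate sum_{i=0}^{k} C(N/2, i) * 2^(-N/2 + k) for even N."""
    half = N // 2
    return sum(comb(half, i) for i in range(k + 1)) * 2.0 ** (k - half)

if __name__ == "__main__":
    print(binary_entropy(1 / 8) + 1 / 8 - 1)
    for N in (16, 32, 64, 128):
        print(N, matching_tail_bound(N, N // 16))
```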



6.3 Classical lower bound for testing Uniformity 

In this section we prove that classically testing Uniformity requires $\Omega(N^{1/2})$ queries. The proof uses the machinery developed by Valiant in [19]. Valiant's techniques apply to testing symmetric properties of distributions, that is, properties that are invariant under relabeling of elements in the domain of a distribution. Clearly, Uniformity is a symmetric property.

We shall need two technical tools from [19], namely, the Positive-Negative Distance lemma and the Wishful Thinking theorem (see Theorem 4 and Lemma 3 in [19]). Let us start by introducing some notation. Let $p \in \mathcal{D}_N$ be an unknown distribution and $S = (i_1, \ldots, i_M)$ be a list of $M$ independent samples drawn from $p$. We shall say that $S$ has a collision of order $r$ iff some element $i \in [N]$ appears in $S$ exactly $r$ times. Let $c_r$ be the total number of collisions of order $r$, where $r \ge 1$. A sequence of integers $\{c_r\}_{r \ge 1}$ is called a fingerprint of $S$. Define a probability distribution $D_p^M$ on the set of fingerprints as follows: (1) Draw $k$ from the Poisson distribution $\mathrm{Poi}(k) = e^{-M} M^k / k!$. (2) Generate a list $S$ of $k$ independent samples drawn from $p$. (3) Output a fingerprint of $S$.
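The three-step definition above translates directly into code. In the sketch below the function names are assumptions, and the Poisson sampler is a simple textbook method (Knuth's multiplication algorithm) chosen only because it needs no external libraries; it is adequate for small means and is not meant as the way such sampling is done in [19].

```python
import math
import random
from collections import Counter

def fingerprint(samples):
    """Return {r: c_r}, where c_r is the number of elements appearing exactly r times."""
    multiplicities = Counter(samples)              # element -> number of occurrences
    return dict(Counter(multiplicities.values()))  # multiplicity r -> count c_r

def poisson(mean, rng):
    """Knuth's multiplication method; suitable for small means only."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def poissonized_fingerprint(p, M, rng):
    k = poisson(M, rng)                                    # step (1)
    samples = rng.choices(range(len(p)), weights=p, k=k)   # step (2)
    return fingerprint(samples)                            # step (3)

if __name__ == "__main__":
    print(fingerprint(["a", "a", "b", "c", "c", "c"]))
    print(poissonized_fingerprint([0.25] * 4, 5, random.Random(0)))
```

Only the multiset of multiplicities is kept, so relabeling the elements of the domain leaves the fingerprint unchanged.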

An important observation made in [19] is that a fingerprint contains all relevant information about a sample list as far as testing symmetric properties is concerned. Thus without loss of generality, a testing algorithm has to make its decision by looking only at the fingerprint of a sample list. Applying the Positive-Negative Distance lemma from [19] to testing Uniformity we get the following result.

Lemma 11 ([19]). Let $u$ be the uniform distribution on $[N]$ and $p \in \mathcal{D}_N$ be any distribution such that $\|p - u\|_1 \ge 1$. If for some integer $M$

$$\|D_p^M - D_u^M\|_1 < \frac{1}{12} \tag{54}$$

then Uniformity is not testable in $M$ samples.

The second technical tool is a usable upper bound on the distance between the distributions of fingerprints. For any integer $k$ define the $k$-th moment of $p$ as

$$m_k(p) = \sum_{i=1}^{N} p_i^k. \tag{55}$$
i=i 

Clearly $m_k(u) = N^{1-k}$, which is the smallest possible value of a $k$-th moment for distributions on $[N]$. Applying the Wishful Thinking theorem from [19] to testing Uniformity we get the following result.

Lemma 12 ([19]). Let $p \in \mathcal{D}_N$ be any distribution such that $\|p\|_\infty \le \delta/M$ for some $\delta > 0$. Then

$$\|D_p^M - D_u^M\|_1 \le 40\delta + 10 \sum_{k \ge 2} \frac{M^k \left( m_k(p) - N^{1-k} \right)}{\lfloor k/2 \rfloor! \sqrt{1 + M^k m_k(p)}}. \tag{56}$$

Corollary 1. Uniformity is not testable classically in $32^{-1} N^{1/2}$ queries.

Proof. Consider a distribution

$$p_i = \begin{cases} 2/N & \text{if } 1 \le i \le N/2, \\ 0 & \text{otherwise.} \end{cases}$$

Clearly $\|p - u\|_1 = 1$ and

$$m_k(p) = 2^{k-1} N^{1-k}.$$

In particular, choosing $M = 2^{-a} N^{1/2}$ we have

$$M^k m_k(p) = 2^{-k(a-1)-1} N^{1 - k/2} \le 2^{-2a+1} \quad \text{for all } k \ge 2.$$

Taking into account that

$$\sum_{k \ge 2} \frac{1}{\lfloor k/2 \rfloor!} \le 2(e - 1) \le 4$$

we can use Eq. (56) to infer that

$$\|D_p^M - D_u^M\|_1 \le 40\delta + 10 \cdot 2^{-2a+3}. \tag{57}$$

Clearly the condition $\|p\|_\infty \le \delta/M$ can be satisfied for any constant $\delta > 0$ and sufficiently large $N$. Then Lemma 11 implies that Uniformity is not testable in $M$ samples whenever $10 \cdot 2^{-2a+3} < 1/12$. It suffices to choose $a = 5$. Finally Lemma 9 implies that Uniformity is not testable in $M$ queries in the oracle model. □
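The arithmetic in this proof can be verified with exact rationals. The sketch below checks the moment formula $m_k(p) = 2^{k-1} N^{1-k}$, the bound $M^k m_k(p) \le 2^{-2a+1}$, the sum over $1/\lfloor k/2 \rfloor!$, and the choice $a = 5$. The helper names and the test value $N = 1024$ (for which $M = 2^{-5} N^{1/2} = 1$) are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import e, factorial

def half_support_distribution(N):
    """p_i = 2/N on the first N/2 elements and 0 elsewhere (N even)."""
    return [Fraction(2, N)] * (N // 2) + [Fraction(0)] * (N // 2)

def moment(p, k):
    """k-th moment m_k(p) = sum_i p_i^k, as in Eq. (55)."""
    return sum(x**k for x in p)

def scaled_moment(p, k, M):
    """M^k * m_k(p), the quantity bounded in the proof of Corollary 1."""
    return Fraction(M) ** k * moment(p, k)

def tail_term(a):
    """The term 10 * 2^(-2a+3) from Eq. (57)."""
    return 10 * Fraction(2) ** (-2 * a + 3)

def floor_factorial_sum(k_max):
    """Partial sum of 1 / floor(k/2)! for k = 2, ..., k_max."""
    return sum(1 / factorial(k // 2) for k in range(2, k_max + 1))

if __name__ == "__main__":
    p = half_support_distribution(1024)
    print(moment(p, 2), scaled_moment(p, 2, 1), tail_term(5), floor_factorial_sum(60))
```

With $a = 4$ the term $10 \cdot 2^{-2a+3}$ equals $10/32$, which exceeds $1/12$, so $a = 5$ is the smallest integer for which the condition holds.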
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